Abstract: There is a recent surge of interest in developing
algorithms for finding sparse solutions of underdetermined systems
of linear equations $y = \Phi x$. In many applications, extremely
large problem sizes are envisioned, with at least tens of thousands
of equations and hundreds of thousands of unknowns. For such
problem sizes, low computational complexity is paramount. The
best studied $\ell_1$ minimization algorithm is not fast enough to fulfill
this need. Iterative thresholding algorithms have been proposed
to address this problem. In this paper we analyze two
of these algorithms theoretically and give sufficient conditions
under which they recover the sparsest solution.



I. INTRODUCTION 

Finding the sparsest solution of an underdetermined system
of linear equations $y = \Phi x$ is a problem of interest in
signal processing, data transmission, biology and statistics,
just to name a few. Unfortunately, this problem is NP-hard
and in general cannot be solved by a polynomial time
algorithm. Chen et al. [1] proposed the following optimization
for recovering the sparsest solution:

$$(\mathcal{Q}_1) \quad \min_x \|x\|_1 \quad \text{s.t.} \quad \Phi x = y,$$

where the $\ell_p$-norm is defined as $\|x\|_p = \left(\sum_i |x_i|^p\right)^{1/p}$.
Greedy methods have also been proposed as another alternative
for solving such a problem. One of the best known
algorithms of this class is orthogonal matching pursuit (OMP)
[2]. Intuitively speaking, at each iteration OMP finds a column
of $\Phi$ which has the maximum correlation with the error of the
approximation up to this step, adds it to the active set, and
projects $y$ onto the range of the active set to get a new estimate.
The third class of algorithms that has drawn a lot of attention
recently is the class of iterative thresholding algorithms. This
class has the least computational complexity and is the most
suitable class for very large scale problems [3]. There are many
theoretical results that prove the optimality of the first two
classes of algorithms under certain conditions, but there are
far fewer rigorous results for thresholding algorithms. Before
mentioning some of the results, we first set up the notation
we are going to use in the paper. Suppose that $x_o \in \mathbb{R}^N$ is
a $k$-sparse vector (i.e. it has at most $k$ non-zero elements).
We observe the measurement vector $y = \Phi x_o$, which is in $\mathbb{R}^n$
($n < N$), and the goal is to reconstruct the original vector $x_o$.
Without loss of generality, we assume that the columns of $\Phi$
have unit $\ell_2$ norm. Another notation that is used in the paper
is the notion of restricted submatrices. For a subset of columns
of $\Phi$ called $J$, $\Phi_J$ includes all the columns of $\Phi$ whose indices
are in $J$, and $x_J$ all the elements of $x$ whose indices are in
$J$. The coherence of $\Phi$ is defined as



$$\mu = \max_{\{i,j:\ 1 \le i,j \le N,\ i \ne j\}} |\langle \phi_i, \phi_j \rangle|, \quad (1)$$

where $\phi_i$ is the $i$-th column of the matrix $\Phi$. In the following, a
summary of the results proved for $\ell_1$ minimization and OMP
in [4] and [5], respectively, is presented.

Theorem 1.1: If $k < \frac{1}{2}(1 + \mu^{-1})$, then both $\ell_1$ minimization
and OMP recover the sparsest solution.
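As a rough illustration of the quantities in Theorem 1.1, the following Python sketch computes the coherence of a small matrix and the largest sparsity level that satisfies the bound. The function names `coherence` and `max_sparsity_theorem_1_1` are hypothetical, and the sketch normalizes the columns itself instead of assuming unit-norm columns; it also assumes the coherence is strictly positive.

```python
import math

def coherence(Phi):
    """Largest absolute normalized inner product between distinct columns.

    Phi is a list of rows; columns are normalized here (an assumption of
    this sketch; the text assumes unit-norm columns from the start).
    """
    n, N = len(Phi), len(Phi[0])
    cols = [[Phi[r][c] for r in range(n)] for c in range(N)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    mu = 0.0
    for i in range(N):
        for j in range(i + 1, N):
            inner = sum(a * b for a, b in zip(cols[i], cols[j]))
            mu = max(mu, abs(inner) / (norms[i] * norms[j]))
    return mu

def max_sparsity_theorem_1_1(mu):
    """Largest integer k with k < (1 + 1/mu) / 2 (requires mu > 0)."""
    bound = 0.5 * (1.0 + 1.0 / mu)
    return math.ceil(bound) - 1

# Hypothetical 2 x 3 example with columns (1,0), (0,1), (1,1).
Phi = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]
mu = coherence(Phi)
print(mu, max_sparsity_theorem_1_1(mu))
```

For this small example the coherence is $1/\sqrt{2}$, so the bound only covers $k = 1$; lower coherence allows larger $k$.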

When the matrix $\Phi$ is drawn from a random ensemble [6],
[7], we can bound the coherence [8] and find conditions
for exact sparse signal recovery. In this random setting,
however, the results can be improved [9]. Although the
theoretical results are mainly focused on $\ell_1$ relaxation and
greedy methods, many large scale applications have already
moved toward thresholding algorithms [10], [11]. In a
recent paper, we considered a few thresholding policies and
showed that the results of these algorithms are very impressive
in practical situations such as compressed sensing [3]. In this
paper we focus on the theoretical aspects of these algorithms.
The organization of the paper is as follows. In Section II we
discuss the thresholding algorithms and the thresholding policy
considered in the paper; the main results of the paper will also
be reviewed. Section III presents the convergence proof of the
thresholding algorithms. In Section V we will briefly review
the existing literature on iterative thresholding algorithms and
compare those results to ours. Finally, Section VI concludes
the paper.

II. Iterative Thresholding Algorithms 

A. Abstracted thresholding Algorithm 

Consider two threshold functions $\eta_\lambda(x)$ to be applied
elementwise to vectors: hard thresholding $\eta^H_\lambda(x) = x\,\mathbb{1}_{\{|x| > \lambda\}}$
and soft thresholding $\eta^S_\lambda(x) = \mathrm{sgn}(x)(|x| - \lambda)_+$, where $\mathbb{1}$ is
the indicator function and $(a)_+$ is equal to $a$ if $a > 0$, and zero
otherwise. Iterative hard thresholding (IHT) and iterative soft
thresholding (IST) algorithms are defined with the following
iteration:

$$x^{t+1} = \eta^{*}_{\lambda_t}\left(x^t + \Phi^T (y - \Phi x^t)\right), \quad (2)$$

where $\lambda_t$ is the threshold value at time $t$, $* \in \{H, S\}$
represents hard or soft thresholding, $\Phi^T$ is the transpose of
the matrix $\Phi$ and $x^t$ is our estimate at time $t$. Note that
the threshold value may depend on the iteration. The basic
intuition is that since the solution satisfies the equation $y = \Phi x$,
the algorithm makes progress by moving in the direction
of the negative gradient of $\|y - \Phi x\|_2^2$, and then by thresholding the
result it tries to get a sparse vector closer to the hyperplane
$y = \Phi x$. Another intuition for this algorithm comes from [12]
and is as follows. Suppose that we want to solve the following
optimization problem,

$$(\mathcal{P}_q) \quad \min_x \|y - \Phi x\|_2^2 + 2\lambda \|x\|_q.$$

It has been proved that the following IST algorithm converges
to the solution of $\mathcal{P}_1$ in case $\|\Phi^T\Phi - I\|_{2,2} < 1$,

$$x^{t+1} = \eta^S_\lambda\left(x^t + \Phi^T(y - \Phi x^t)\right), \quad (3)$$

where $\|A\|_{2,2}$ is the spectral norm of the matrix $A$. It may
be noted that $\lambda$ is fixed here and does not depend on the
iteration. It is also well-known that as $\lambda \to 0$ the solution of
$\mathcal{P}_1$ converges to the solution of $\mathcal{Q}_1$. But it is easy to see that if
$\Phi$ is a fat matrix, setting $\lambda$ to a very small value in (3) will
not work. A proper thresholding policy is to set the threshold
to a large value and gradually decrease it as the algorithm
proceeds. The following theorem justifies this intuition.

Consider the iterative soft or hard thresholding algorithms
introduced in equation (2). Suppose $\lambda_t \to 0$ as $t \to \infty$, and $\lambda_t$
is a decreasing sequence (this condition may not hold, but for
the simplicity of the proof we assume it is true). Let $J_t$ denote
the union of the supports of $x^t$ and $x_o$ and define $L_t := J_{t+1} \cup J_t$.
Assume that $L_t$ satisfies $\sup_t \|I - \Phi_{L_t}^T \Phi_{L_t}\|_{2,2} = \gamma < 1$.
Under these conditions:

Theorem 2.1: The iterative thresholding algorithm will con- 
verge to the sparsest solution. 
Proof: 

$$\begin{aligned}
\|x^{t+1} - x_o\|_2 &= \|x^{t+1}_{L_t} - x_{o,L_t}\|_2 \\
&\le \|x^t_{L_t} + \Phi_{L_t}^T(\Phi_{L_t} x_{o,L_t} - \Phi_{L_t} x^t_{L_t}) + e_{t+1} - x_{o,L_t}\|_2 \\
&\overset{(1)}{\le} \|(I - \Phi_{L_t}^T\Phi_{L_t})(x^t_{L_t} - x_{o,L_t})\|_2 + \sqrt{n}\,\lambda_{t+1} \\
&\le \|I - \Phi_{L_t}^T\Phi_{L_t}\|_{2,2}\,\|x^t_{L_t} - x_{o,L_t}\|_2 + \sqrt{n}\,\lambda_{t+1},
\end{aligned}$$

where $e_{t+1}$ is an extra error introduced by the thresholding
process, and therefore the magnitude of each element of this vector
is less than $\lambda_{t+1}$. Also, all the elements that are not in $L_t$ are zero.
Inequality (1) is just the triangle inequality for the $\ell_2$ norm. For
any $\epsilon > 0$, choose $T_0$ such that $\sqrt{n}\,\lambda_{T_0+1} < \frac{\epsilon(1-\gamma)}{2}$, and let
$\|x^{T_0+1} - x_o\|_2 = e$. Then, find $T_1$ such that $\gamma^{T_1} e < \epsilon/2$. Now
it is easy to prove that at $t = T_0 + T_1$ the error is less than $\epsilon$,
and therefore the total error goes to zero. ■
This theorem is not useful for practical purposes since we
would need information on the size of $L_t$. In the next section
we describe a thresholding policy that can be used in practice.
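The contraction argument in the proof of Theorem 2.1 can be illustrated numerically. The Python sketch below iterates the scalar recursion $e_{t+1} = \gamma e_t + \sqrt{n}\,\lambda_{t+1}$ that upper-bounds the error at each step. The function name `error_bounds`, the values of $\gamma$ and $n$, and the geometric threshold schedule are all assumptions chosen for illustration; none of them comes from the paper.

```python
import math

def error_bounds(e0, gamma, thresholds, n):
    """Upper bounds obeying e_{t+1} = gamma * e_t + sqrt(n) * lambda_{t+1}."""
    bounds = [e0]
    for lam in thresholds:
        bounds.append(gamma * bounds[-1] + math.sqrt(n) * lam)
    return bounds

# Hypothetical setting: gamma = 0.5, n = 4, thresholds lambda_t = 0.5**t.
schedule = [0.5 ** t for t in range(1, 60)]
bounds = error_bounds(1.0, 0.5, schedule, 4)
print(bounds[1], bounds[-1])
```

With a decreasing threshold sequence that tends to zero, the bound may grow during the first steps but eventually shrinks toward zero, which is the behavior the proof relies on.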

B. Thresholding Policy 

Suppose that an oracle tells us the true underlying $k$. Then,
since the final solution is $k$-sparse, the threshold can be set to
the magnitude of the $(k+1)$-th largest coefficient. This type of
thresholding policy has also been used in [13], [14], [15]. The
only problem is how to get the oracle information. In a recent
paper, we showed how one can de-oraclize such algorithms for
compressed sensing problems [3]. For other types of problems,
$k$ may be estimated using cross validation. If neither of these
two methods is applicable, the bounds derived in this paper for
the sparsity may be used for setting $k$. From now on, whenever
we refer to IHT or IST, the thresholding policy is the $k$ largest
element thresholding policy unless otherwise stated.
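The iteration of Section II-A combined with the $k$ largest element policy can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names `hard_threshold`, `soft_threshold`, `kth_largest_threshold` and `iterative_thresholding` are hypothetical, the starting point is taken to be the zero vector, the number of steps is fixed in advance, and the matrix is stored as a plain list of rows.

```python
import math

def hard_threshold(v, lam):
    """Keep entries whose magnitude exceeds lam; set the rest to zero."""
    return [vi if abs(vi) > lam else 0.0 for vi in v]

def soft_threshold(v, lam):
    """Shrink every entry toward zero by lam: sgn(v) * max(|v| - lam, 0)."""
    return [math.copysign(max(abs(vi) - lam, 0.0), vi) for vi in v]

def kth_largest_threshold(v, k):
    """Magnitude of the (k+1)-th largest entry of v (0 if there is none)."""
    mags = sorted((abs(vi) for vi in v), reverse=True)
    return mags[k] if k < len(mags) else 0.0

def iterative_thresholding(Phi, y, k, rule, steps):
    """Run x <- rule(x + Phi^T (y - Phi x), lam) starting from x = 0."""
    n, N = len(Phi), len(Phi[0])
    x = [0.0] * N
    for _ in range(steps):
        residual = [y[r] - sum(Phi[r][c] * x[c] for c in range(N))
                    for r in range(n)]
        z = [x[c] + sum(Phi[r][c] * residual[r] for r in range(n))
             for c in range(N)]
        lam = kth_largest_threshold(z, k)  # oracle k is assumed known
        x = rule(z, lam)
    return x

# Trivial sanity check with an identity matrix (not an underdetermined system).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(iterative_thresholding(identity, [2.0, 0.0, 0.0], 1, hard_threshold, 3))
```

Passing `soft_threshold` instead of `hard_threshold` gives the IST variant; the only difference is that surviving entries are also shrunk by the threshold.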

C. Main Results 

We will prove two main theorems for the two thresholding 
algorithms that have been mentioned in the last section. 

Theorem 2.2: Suppose that $k < \frac{1}{3.1}\mu^{-1}$ and
$\frac{|x_o(i)|}{|x_o(i+1)|} < 3^{\ell_i - 4}$, $\forall i$, $1 \le i < k$. Then IHT finds the correct active set in
at most $\sum_{i=1}^{k-1} \ell_i + k$ steps. After this step all of these elements
will remain in the active set and the error will go to zero
exponentially fast.

Theorem 2.3: Suppose that $k < \frac{1}{4.1}\mu^{-1}$ and $\forall i$, $1 \le i < k$,
we have $\frac{|x_o(i)|}{|x_o(i+1)|} < 2^{\ell_i - 5}$. Then IST recovers the correct
active set in at most $\sum_{i=1}^{k-1} \ell_i + k$ steps. After that all these
coefficients will remain in the active set and the error will go
to zero exponentially fast.

The sufficient conditions provided here are slightly weaker
than the conditions mentioned for $\ell_1$ or OMP. Simulation
results also confirm that IHT and IST are weaker than $\ell_1$
in practice [3]. Another interesting fact is that the number
of iterations needed depends on the ratio of the coefficients,
but this dependency is roughly logarithmic and therefore it
will work well in practice. Also, the algorithms find the
correct active set in a finite number of iterations, and once
the algorithms find the correct active set, they converge to the
exact solution exponentially fast.
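To make the logarithmic dependence on the coefficient ratios concrete, the following Python sketch evaluates the step bound for a given set of coefficient magnitudes. It assumes that each $\ell_i$ is taken as the smallest integer for which the ratio of consecutive sorted magnitudes is strictly below $b^{\ell_i - c}$, with $(b, c) = (3, 4)$ for IHT and $(2, 5)$ for IST, and that the sum runs over the $k-1$ consecutive ratios. The function names `step_exponents` and `iteration_bound` are hypothetical.

```python
def step_exponents(sorted_mags, base, offset):
    """Smallest integers l_i with |x(i)| / |x(i+1)| < base ** (l_i - offset)."""
    exponents = []
    for larger, smaller in zip(sorted_mags, sorted_mags[1:]):
        ratio = larger / smaller
        l = offset
        while base ** (l - offset) <= ratio:
            l += 1
        exponents.append(l)
    return exponents

def iteration_bound(x, base, offset):
    """Sum of the l_i over consecutive ratios plus the number of non-zeros."""
    mags = sorted((abs(v) for v in x if v != 0), reverse=True)
    return sum(step_exponents(mags, base, offset)) + len(mags)

# Hypothetical coefficients with magnitudes 8, 4 and 1.
x = [1.0, -8.0, 0.0, 4.0]
print(iteration_bound(x, 3, 4))  # IHT-style constants
print(iteration_bound(x, 2, 5))  # IST-style constants
```

Doubling a ratio increases the corresponding exponent by at most one in the IST case, which is the sense in which the dependence is roughly logarithmic.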

III. Proof of Convergence for the IHT Algorithm 

The goal of this section is to give an outline of the proof
of Theorem 2.2. We define the following two variables,

$$z^t = x^t + \Phi^T(\Phi x_o - \Phi x^t), \quad (4)$$
$$w^t = x_o - x^t, \quad (5)$$

where $x_o$ is the optimal value and $x^t$ is our estimate at the
$t$-th step. The $j$-th element of these two vectors will be denoted
by $z^t(j)$ and $w^t(j)$. The active set of $x^t$ is called $I^t$. Finally,
$x_o(i)$ denotes the $i$-th element of $x_o$. Without loss of generality
we assume that the $x_o(i)$'s are sorted in descending order of their
absolute values, and therefore the only non-zero elements of
$x_o$ are the first $k$ elements. The next lemma will be useful
later when we try to bound the error at each iteration.

Lemma 3.1: Consider the following sequence for $s \ge 0$,

$$f_s = \alpha + \alpha^2 + \ldots + \alpha^s + \beta\alpha^{s+1},$$

where $0 < \alpha < 1$. The following statements are true:

1) If $\beta(1-\alpha) < 1$, then for every $s$, $f_s < \frac{\alpha}{1-\alpha}$.

2) If $\beta(1-\alpha) > 1$, then for every $s$, $f_s \le \beta\alpha$.

3) If $\beta(1-\alpha) = 1$, then $f_s$ is a constant sequence and is
always equal to $\frac{\alpha}{1-\alpha}$.

It is easy to see that the sequence is either increasing or
decreasing or constant depending on the values of $\alpha$ and $\beta$.
The proof is simple and is omitted for the sake of brevity.
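The three cases of Lemma 3.1 are easy to check numerically. The short Python sketch below evaluates the sequence directly; the function name `f` and the particular values of $\alpha$ and $\beta$ are chosen only for illustration.

```python
def f(s, alpha, beta):
    """alpha + alpha**2 + ... + alpha**s + beta * alpha**(s + 1)."""
    return sum(alpha ** j for j in range(1, s + 1)) + beta * alpha ** (s + 1)

# Case 1: beta * (1 - alpha) < 1, increasing toward alpha / (1 - alpha).
print([round(f(s, 0.3, 1.0), 4) for s in range(4)])
# Case 2: beta * (1 - alpha) > 1, decreasing from beta * alpha.
print([round(f(s, 0.3, 1.5), 4) for s in range(4)])
# Case 3: beta * (1 - alpha) = 1, constant.
print([round(f(s, 0.25, 4.0 / 3.0), 4) for s in range(4)])
```

Successive terms differ by $\alpha^{s+1}(1 - \beta(1-\alpha))$, which explains why the sign of $1 - \beta(1-\alpha)$ decides between the three cases.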

Lemma 3.2: Suppose that $x_o(1), x_o(2), \ldots, x_o(r-1)$, $r - 1 < k$,
are in the active set at the $m$-th step. Also assume that

$$|z^m(j) - x_o(j)| \le 1.5k\mu|x_o(r-1)| \quad \forall j.$$

If $k\mu < \frac{1}{3.1}$, then at stage $m+s$ and for every $j$ we will have
the following upper bound for $|z^{m+s}(j) - x_o(j)|$,

$$|x_o(r)|\left(k\mu + \ldots + (k\mu)^s\right) + 1.5(k\mu)^{s+1}|x_o(r-1)|. \quad (6)$$

Moreover, $x_o(1), x_o(2), \ldots, x_o(r-1)$ will remain in the
active set.

Proof: We prove this by induction. Assuming that the
bound holds at stage $m+s$ and $x_o(1), x_o(2), \ldots, x_o(r-1)$
are in the active set, we show that the upper bound holds at
stage $m+s+1$ and the first $r-1$ elements will remain in
the active set.

$$\begin{aligned}
|z^{m+s+1}(i) - x_o(i)| &\le \sum_{j \in I^{m+s}\setminus\{i\}} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| + \sum_{j \in \{1,2,\ldots,k\}\setminus (I^{m+s}\cup\{i\})} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| \\
&\overset{(1)}{=} \sum_{j \in I^{m+s}\setminus\{i\}} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| + \sum_{j \in \{r,\ldots,k\}\setminus (I^{m+s}\cup\{i\})} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| \\
&\overset{(2)}{\le} \sum_{j \in I^{m+s}\setminus\{i\}} |\langle\phi_i,\phi_j\rangle (z^{m+s}(j) - x_o(j))| + k\mu|x_o(r)| \\
&\le k\mu|x_o(r)|\left(k\mu + \ldots + (k\mu)^s\right) + 1.5(k\mu)^{s+2}|x_o(r-1)| + k\mu|x_o(r)| \\
&\le |x_o(r)|\left(k\mu + \ldots + (k\mu)^{s+1}\right) + 1.5(k\mu)^{s+2}|x_o(r-1)|.
\end{aligned}$$

In these calculations equality (1) is due to the assumptions of
the induction, i.e. the first $r-1$ elements are in the active set at
stage $m+s$. To get inequality (2) we have used two different
facts. The first one is that when $j \in I^{m+s}$, $w^{m+s}(j) = x_o(j) - z^{m+s}(j)$,
and the second one is that when $j \in \{r, \ldots, k\}\setminus I^{m+s}$
then $w^{m+s}(j) = x_o(j)$ and therefore $|x_o(j)| \le |x_o(r)|$. The
last step is to prove that all the first $r-1$ elements remain in
the active set. For $i \in \{1, 2, \ldots, r-1\}$,

$$\begin{aligned}
|z^{m+s+1}(i)| &\ge |x_o(i)| - |z^{m+s+1}(i) - x_o(i)| \\
&\overset{(1)}{\ge} |x_o(i)| - \left(k\mu|x_o(r-1)| + \ldots + (k\mu)^{s+1}|x_o(r-1)|\right) - 1.5(k\mu)^{s+2}|x_o(r-1)| \\
&\overset{(2)}{\ge} |x_o(i)| - \frac{|x_o(r-1)|}{2.05} \ge |x_o(r-1)| - \frac{|x_o(r-1)|}{2.05}.
\end{aligned}$$

In inequality (1) we have used the bound in (6) by replacing
$x_o(r)$ with $x_o(r-1)$. Inequality (2) is the result of Lemma
3.1. For $i \notin \{1, 2, \ldots, k\}$, we have

$$|z^{m+s+1}(i)| \le \frac{|x_o(r-1)|}{2.05},$$

and since $\min_{\{i:\ i \le r-1\}} |z^{m+s+1}(i)| > \max_{\{i:\ i > k\}} |z^{m+s+1}(i)|$,
the first $r-1$ elements will remain in the active set. The base
of the induction is the same
as the assumptions of this lemma and the proof is complete.

■ 

Lemma 3.3: Suppose that $k < \frac{1}{3.1}\mu^{-1}$, and
$x_o(1), x_o(2), \ldots, x_o(r)$, $r < k$, are in the active set at
the $m$-th step. Also assume that $\frac{|x_o(r)|}{|x_o(r+1)|} < 3^{\ell_r - 4}$. If

$$|z^m(j) - x_o(j)| \le 1.5k\mu|x_o(r)| \quad \forall j,$$

after $\ell_r$ more steps $x_o(r+1)$ will get into the active set, and

$$|z^{m+\ell_r+1}(j) - x_o(j)| \le 1.5k\mu|x_o(r+1)| \quad \forall j.$$

Proof: By setting $s = \ell_r$ in the upper bound of the last lemma we get,

$$|z^{m+\ell_r}(j) - x_o(j)| \le \frac{1.5|x_o(r+1)|}{243} + \frac{|x_o(r+1)|}{2.1}.$$

Similar to the last lemma it is also not difficult to see that

$$\begin{aligned}
|z^{m+\ell_r}(r+1)| &= |z^{m+\ell_r}(r+1) - x_o(r+1) + x_o(r+1)| \\
&\ge |x_o(r+1)| - |z^{m+\ell_r}(r+1) - x_o(r+1)| \\
&\ge |x_o(r+1)| - \frac{1.5|x_o(r+1)|}{243} - \frac{|x_o(r+1)|}{2.1}.
\end{aligned}$$

But,

$$|z^{m+\ell_r}(r+1)| > \max_{\{i:\ i > k\}} |z^{m+\ell_r}(i)|,$$

and therefore $x_o(r+1)$ will be detected at this step. It may also
be noted that at this stage the error is less than $|x_o(r+1)|/2$.
For the next stage we will have at most $k$ active elements, the
error of each of which is less than $|x_o(r+1)|/2$, and at most $k - r$
non-zero elements of $x_o$ that have not passed the threshold and
whose magnitudes are smaller than $|x_o(r+1)|$. Therefore, the
error of the next step is less than $1.5k\mu|x_o(r+1)|$. ■
Our goal is to prove the correctness of IHT by induction and 
we have to know the correctness of IHT at the first stage. The 
following lemma provides this missing step. 

Lemma 3.4: Suppose that $k < \frac{1}{3.1}\mu^{-1}$; then at the first
stage of IHT, $x_o(1)$ will be in the active set$^{1}$ and
$|z^1(j) - x_o(j)| \le k\mu|x_o(1)|$.
Proof:

$$|z^1(1)| = \Big|\sum_{j=1}^{k} \langle\phi_1,\phi_j\rangle x_o(j)\Big| \ge |x_o(1)| - k\mu|x_o(1)|.$$

On the other hand,

$$\max_{\{i:\ k<i\}} |z^1(i)| = \max_{\{i:\ k<i\}} \Big|\sum_{j=1}^{k} \langle\phi_i,\phi_j\rangle x_o(j)\Big| \le k\mu|x_o(1)|.$$

Therefore, since $k\mu < 1 - k\mu$, the index of the first element
will be in the active set after the first step. The last claim of
the lemma is also clear. ■
Finally the following lemma describes the performance of the 
algorithm after detecting all the non-zero elements. 

Lemma 3.5: Suppose that $x_o(1), x_o(2), \ldots, x_o(k)$ are in
the active set at the $m$-th step. Also assume that

$$|z^m(j) - x_o(j)| \le 1.5k\mu|x_o(k)| \quad \forall j.$$

If $k\mu < \frac{1}{3.1}$, then at stage $m+s$ and for every $j$ we will have

$$|z^{m+s}(j) - x_o(j)| \le 1.5(k\mu)^{s+1}|x_o(k)|.$$

Since the proof of this lemma is very similar to the proof
of Lemma 3.2, it is omitted.

Proof: [Outline of the proof of Theorem 2.2] The proof is an
induction that combines the above lemmas. Suppose that
$x_o(1), x_o(2), \ldots, x_o(r)$ are already in the active set.
According to Lemma 3.2 all these terms will remain in the active
set, and according to Lemma 3.3 after $\ell_r$ steps $x_o(r+1)$ will
also get into the active set. In one more step, the error on each
element gets smaller than $1.5k\mu|x_o(r+1)|$, and everything can
be repeated. Lemma 3.4 provides the first step of the induction.
Finally, when all the elements are in the active set, Lemma 3.5
tells us that the error goes to zero exponentially fast. ■

$^{1}$This result holds even if $k\mu < \frac{1}{2}$. For the sake of consistency with the
other parts of the proof we state it in this way.

Since the proof of the convergence of IST is very similar to
IHT we do not repeat it here. You may refer to [16] for more
details.

IV. Proof of Convergence for the IST Algorithm

As mentioned before, the main ideas of the proof for the
IST algorithm are very similar to those for IHT. We will
present the proof in detail but will try to put more emphasis
on the differences. The following lemma helps us find some
bounds on the error of the algorithm at each step.

Lemma 4.1: Suppose that $x_o(1), x_o(2), \ldots, x_o(r)$, $r < k$,
are in the active set at the $m$-th step. Also assume that

$$|x^m(j) - x_o(j)| \le 4k\mu|x_o(r)|, \quad \forall j \in I^m,$$

and $k\mu < \frac{1}{4.1}$. Then at stage $m+s$, $\forall i \in I^{m+s}$ we have the
following upper bound for $|x^{m+s}(i) - x_o(i)|$,

$$|x_o(r+1)|\left(2k\mu + \ldots + (2k\mu)^s\right) + 2(2k\mu)^{s+1}|x_o(r)|.$$

Moreover, $x_o(1), x_o(2), \ldots, x_o(r)$ remain in the active set.

Proof: As before, this can be proved by induction.
We assume that at step $m+s$ the upper bound holds and
$x_o(1), x_o(2), \ldots, x_o(r)$ are in the active set, and we prove the
same things for $m+s+1$. Similar to what we saw before,

$$\begin{aligned}
|z^{m+s+1}(i) - x_o(i)| &\le \sum_{j \in I^{m+s}\setminus\{i\}} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| + \sum_{j \in \{1,2,\ldots,k\}\setminus (I^{m+s}\cup\{i\})} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| \\
&\overset{(1)}{=} \sum_{j \in I^{m+s}\setminus\{i\}} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| + \sum_{j \in \{r+1,\ldots,k\}\setminus (I^{m+s}\cup\{i\})} |\langle\phi_i,\phi_j\rangle w^{m+s}(j)| \\
&\overset{(2)}{\le} (k-1)\mu\left(2k\mu|x_o(r+1)| + \ldots + (2k\mu)^s|x_o(r+1)| + 2(2k\mu)^{s+1}|x_o(r)|\right) + k\mu|x_o(r+1)| := a_s.
\end{aligned}$$

Equality (1) uses the assumption that the first $r$ elements
are in the active set at stage $m+s$. Inequality (2) is also
due to the assumptions of the induction and the fact that
$w^{m+s}(j) = x_o(j) - x^{m+s}(j)$.

At least one of the largest $k+1$ coefficients of $z$ corresponds
to an element whose index is not in $\{1, 2, \ldots, k\}$, and the
magnitude of this coefficient is less than $a_s$. Therefore the
threshold value is less than or equal to $a_s$. Applying the
soft thresholding to $z$ will add at most $a_s$ to the distance of
$z^{m+s+1}(i)$ and $x_o(i)$, and this completes the proof of the upper
bound. The main thing that should be checked is whether
the first $r$ elements will remain in the active set or not. For
$i \in \{1, 2, \ldots, r\}$ we have,

$$\begin{aligned}
|z^{m+s+1}(i)| &\ge |x_o(i)| - |z^{m+s+1}(i) - x_o(i)| \\
&\ge |x_o(i)| - k\mu|x_o(r)|\left(1 + 2k\mu + \ldots + (2k\mu)^{s}\right) - 2k\mu(2k\mu)^{s+1}|x_o(r)| \\
&\ge |x_o(i)| - \frac{|x_o(r)|}{2.05} \ge |x_o(r)| - \frac{|x_o(r)|}{2.05}. \quad (7)
\end{aligned}$$

If the sequence in the above expression is multiplied by 2,
the result will be a sequence in the form of the sequences
mentioned in Lemma 3.1 for $\alpha = 2k\mu$, $\beta = 2$, and the bound by
$|x_o(r)|/2.05$ is based on that lemma.

If $i \notin \{1, 2, \ldots, k\}$,

$$|z^{m+s+1}(i)| \le k\mu|x_o(r)|\left(1 + 2k\mu + \ldots + (2k\mu)^{s}\right) + 2k\mu(2k\mu)^{s+1}|x_o(r)| \le \frac{|x_o(r)|}{2.05}.$$

Since $\min_{\{i:\ i \le r\}} |z^{m+s+1}(i)| > \max_{\{i:\ i > k\}} |z^{m+s+1}(i)|$, the
first $r$ elements remain in the active set. The base of the
induction is also clear since it is the same as the assumptions
of the lemma. ■

Lemma 4.2: Suppose that $k < \frac{1}{4.1}\mu^{-1}$ and
$x_o(1), x_o(2), \ldots, x_o(r)$, $r < k$, are in the active set at
the $m$-th step. Also, assume that $\frac{|x_o(r)|}{|x_o(r+1)|} < 2^{\ell_r - 5}$. If

$$|x^m(j) - x_o(j)| \le 4k\mu|x_o(r)|, \quad \forall j \in I^m,$$

then after $\ell_r$ steps $x_o(r+1)$ will get into the active set, and

$$|x^{m+\ell_r+1}(j) - x_o(j)| \le 4k\mu|x_o(r+1)|, \quad \forall j \in I^{m+\ell_r+1}.$$

Proof: As before we try to find a bound for the error at
time $m + \ell_r$. For $i \in \{1, 2, \ldots, k\}$,

$$\begin{aligned}
|z^{m+\ell_r}(i) - x_o(i)| &\le \frac{1}{2}|x_o(r+1)|\left(2k\mu + \ldots + (2k\mu)^{\ell_r}\right) + (2k\mu)^{\ell_r+1}|x_o(r)| \\
&\le \frac{|x_o(r+1)|}{2.1} + \frac{|x_o(r+1)|}{64},
\end{aligned}$$

and therefore for $i = r+1$,

$$|z^{m+\ell_r}(r+1)| \ge |x_o(r+1)| - |z^{m+\ell_r}(i) - x_o(i)| \ge |x_o(r+1)| - \frac{|x_o(r+1)|}{2.1} - \frac{|x_o(r+1)|}{64}. \quad (8)$$

Since $|z^{m+\ell_r}(r+1)| > \max_{\{i:\ k<i\}} |z^{m+\ell_r}(i)|$, the $(r+1)$-th
element will get into the active set at this stage. On the other
hand, for any $i \in I^{m+\ell_r}$ we have $|x^{m+\ell_r}(i) - x_o(i)| \le |x_o(r+1)|$.
For the next stage of the algorithm we will have at most
$2k$ non-zero $x^{m+\ell_r}(i) - x_o(i)$, and the absolute value of each of
them is less than $|x_o(r+1)|$. Therefore $|z^{m+\ell_r+1}(i) - x_o(i)| \le
2k\mu|x_o(r+1)|$, and after thresholding we have
$|x^{m+\ell_r+1}(i) - x_o(i)| \le 4k\mu|x_o(r+1)|$ for $i \in I^{m+\ell_r+1}$.
The base of the induction is also clear from the assumptions
of this lemma and the proof is complete. ■
For the IHT algorithm we proved that at the first step the first
element will pass the threshold. Since the selection step of IST
and IHT is exactly the same, we can claim that the same thing
is true for IST, i.e. the largest magnitude coefficient will pass
the threshold. Also, as we saw for IHT, the error was less than
$k\mu|x_o(1)|$. Therefore, for IST we have $|x^1(j) - x_o(j)| \le
2k\mu|x_o(1)|$. These bounds are even better than the bounds we
need for Lemmas 4.1, 4.2 and 4.3.

The following lemma will explain what happens when the 
algorithm detects all the non-zero elements. 

Lemma 4.3: Suppose that $x_o(1), \ldots, x_o(k)$ are in the
active set at the $m$-th step. Also assume that

$$|x^m(j) - x_o(j)| \le 4k\mu|x_o(k)|.$$

If $k\mu < \frac{1}{4.1}$, at stage $m+s$ all the elements remain in the
active set and for every $j$ we will have

$$|x^{m+s}(j) - x_o(j)| \le 2(2k\mu)^{s+1}|x_o(k)|.$$

The proof of this lemma is very similar to the other lemmas
and is omitted.

Proof: [Outline of the proof of Theorem 2.3] The proof is
a simple induction combining the above lemmas. Suppose
that $x_o(1), x_o(2), \ldots, x_o(r)$ are already in the active set.
According to Lemma 4.1 all these terms will remain in the
active set, and according to Lemma 4.2 after $\ell_r$ steps $x_o(r+1)$
will also get into the active set. In one more step, the error on
each element gets smaller than $4k\mu|x_o(r+1)|$, and everything
can be repeated. Although we have not mentioned the first step
of the induction, it is not difficult to see that this step is also true
and is very similar to the first step of IHT. Finally, when all
the elements are in the active set, Lemma 4.3 tells us that the
error goes to zero exponentially fast. ■

V. Discussion and Comparison With other Work 

There is a huge amount of work on iterative thresholding
algorithms, and we cannot mention all of it here; the
interested reader is referred to [3]. Most of these papers
deal with a fixed threshold that does not depend on the
iteration. In that case, there are rigorous results that give
sufficient conditions for the IST algorithm to converge to the
solution of $\mathcal{P}_1$ [12], and for the IHT algorithm to converge to a local
minimum of $\mathcal{P}_0$ [17]. The idea of choosing iteration dependent
thresholds is also not new, and some simple variations were
introduced in [11]. The $k$ largest element thresholding policy
was first introduced in [13] and was first used for IHT in
[15]. It was also shown that if the matrix $\Phi$ satisfies the restricted
isometry property (RIP) of order $3k$, IHT converges to
the sparsest solution. There are some basic differences in our
approach. First, we are dealing with deterministic settings,
and in these settings the RIP conditions they have provided are
much weaker than ours ($k\mu < \ldots$ compared to $k\mu < \frac{1}{3.1}$).
Under these more general conditions, as we observed, the
performance of IHT is not as simple as what is mentioned
in [15], and it may not recover $x_o$ in just $k$ steps. But it
will finally recover the sparsest signal, and we give bounds
on the number of iterations it needs to converge. Secondly, as
discussed in the last section, our approach was easily adapted
to IST, and can be adapted to other types of thresholds.
Moreover, our method gives us an ordering among $\ell_1$, OMP,
IHT and IST which may be useful for deciding on the choice of
the algorithm. Finally, there is another effort on analyzing the
performance of IST by coherence that shows the possibility of
success of such an algorithm at the first iteration [18]. But this
result does not draw any conclusion about the next iterations
of IST in case it does not recover all the non-zero elements at
the first step.

VI. Conclusion 

In this paper, we analyzed iterative hard and soft thresholding,
and proved that under certain conditions they work
properly. These conditions are slightly weaker than their
counterparts for $\ell_1$ and OMP. But these algorithms are very simple
to implement and much faster than both convex relaxation and
greedy methods, and they are much more desirable for large
scale problems.